[misc] refactor: refactor the folders structure by zhtmike · Pull Request #3 · verl-project/verl-omni

zhtmike · 2026-04-22T08:23:30Z

What does this PR do?

refactor the folders strcutures, add verl dependency

Add concise overview of what this PR aims to achieve or accomplish. Reference related GitHub issues and PRs that help with the review.

Checklist Before Starting

Search for similar PRs. Paste at least one query link here: ...
Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
- {modules} include fsdp, vllm_omni, rollout, trainer, ci, training_utils, recipe, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data, cfg, reward, diffusion, omni, tests, docker
- If this PR involves multiple modules, separate them with , like [diffusion, doc]
- {type} is in feat, fix, refactor, chore, test
- If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
- Example: [BREAKING][diffusion, fsdp] feat: new rollout scheduler

Test

For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

# Add code snippet or script demonstrating how to use this

Design & Code Changes

Demonstrate the high-level design if this PR is complex, and list the specific changes.

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

Read the Contribute Guide.
Apply pre-commit checks: pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
Add / Update the documentation.
Add unit or end-to-end test(s) to the CI workflow to cover all the code. If not feasible, explain why: ...

gemini-code-assist

Code Review

This pull request refactors the repository to rename the core package from verl to verl_omni, focusing on RL training for diffusion and omni-modality models. The changes include comprehensive updates to imports, configuration paths, and project metadata, as well as the removal of legacy PPO configurations. Feedback indicates that version requirements for tensordict, vllm, and verl appear incorrect or refer to non-existent public versions, which could lead to installation failures. Additionally, a suggestion was made to update the import path for TrainingWorkerConfig in the test suite to maintain consistency with the new package structure.

SamitHuang · 2026-04-22T10:09:43Z

+        parent.diffusers_impl = _omni_impl
+
+
+def apply_patches() -> None:


Maybe we can add log info here, to remind dropping these patches in future version

SamitHuang · 2026-04-22T10:10:03Z

+cachetools
 codetiming
 datasets
+diffusers


specify the version?

SamitHuang · 2026-04-22T10:11:05Z

    __version__ = f.read().strip()

 install_requires = [
+    "verl>=0.7.1",


did you try to install in this way?

No, we hold a temporally version here. I think it should be formally supported with 0.8.0?

@Package

…s — client wiring, Hydra rebuild, multi_modal_inputs, eval + docs Round 3 of the Qwen3-TTS GRPO recipe. Addresses all four Codex round-2 findings and lands T15 / T16 / T17 / T18 stubs. 111 recipe-scope tests pass. Codex round-2 blocker fixes: verl-project#1 — Real client wiring. New verl_omni/workers/rollout/_tts_client_patch.py monkey-patches LLMServerManager.get_client to return AutoRegressiveTTSServerClient when rollout.name == "vllm_omni_tts" (falls through to upstream LLMServerClient otherwise). Patch is idempotent and triggered at import time from verl_omni/workers/rollout/replica.py. New tests in tests/workers/test_tts_client_patch.py prove the upstream LLMServerManager now returns a client with generate_tts() for the AR-TTS rollout name. verl-project#2 — Hydra config tree rebuilt to compose correctly. The previous subdirs (actor/rollout/ref/reward) used "# @Package _global_" which double-nested every section, so the trainer saw keys like "rollout.actor_rollout_ref.rollout" instead of "actor_rollout_ref.rollout". Deleted the broken subdirs and replaced the recipe with a single top-level verl_omni/trainer/config/qwen3_tts_trainer.yaml that: - Inherits the diffusion recipe's full defaults tree (diffusion/{actor,ref,rollout,model,model_engine} + data + reward) so all upstream-required sections (ray_kwargs, etc.) are present. - Overrides actor_rollout_ref.rollout for vllm_omni_tts + AR-TTS agent loop / manager. - Overrides algorithm with upstream verl AlgoConfig (adv_estimator=grpo). - Overrides reward.reward_manager to AsrErrorRateRewardManager. - Wires data.custom_cls = Qwen3TTSDataset. verl_omni/trainer/qwen3_tts_grpo/main.py adjusted to load from ../config with config_name=qwen3_tts_trainer. New tests in tests/trainer/test_qwen3_tts_hydra_compose.py compose the shipped yaml via Hydra, apply launch overrides, and verify: - Top-level keys land at expected paths (no double-nesting). - AR-TTS overrides survive composition. - The recipe validator accepts the composed config. - The validator rejects missing ASR URL, flow_grpo, co_located ASR, and n<2 at the real composition layer (not synthetic dicts). verl-project#3 — multi_modal_inputs in worker output. verl_omni/agent_loop/autoregressive_tts_agent_loop.py:_postprocess() now emits a multi_modal_inputs object-array column (empty dict per row, audio recipes have no image/video inputs) plus __num_turns__, so upstream RayPPOTrainer.training_step's unconditional batch.non_tensor_batch["multi_modal_inputs"] iteration succeeds. verl-project#4 — Recipe artifacts finished, no further deferrals: - examples/qwen3_tts_grpo_trainer/eval.sh + run_full.sh + verl_omni/trainer/qwen3_tts_grpo/run_eval.py. - run_eval.py writes eval_results.json with the AC-7 schema (base/RL CER + median/mean duration_ratio), enforces target_utt_id disjointness, and supports mock-inference JSON for offline scoring. - verl_omni/utils/validation_audio_logger.py implements AC-8: writes >=4 generated + ref + (optional target) wavs per validation step plus metrics.json; ArtifactWriteError on disk failures; post-run helper flags zero-artifact steps. - docs/recipe/qwen3_tts_grpo.md documents pipeline, layout, smoke, eval, full run, hardware, validation logging. - examples/qwen3_tts_grpo_trainer/README.md updated — open-items section replaced with eval / full-run / audio-logging sections. Test totals (recipe scope): 111 passed. - dataset 9, audio output 4, reward 19, pairing 4, validator 9, hydra compose 7, run_eval 2, client patch 4, audio logger 5, plus pre-existing recipe-adjacent tests. Still pending: T1 / T2b / T3 / T14 — live-engine smoke against the downloaded Qwen3-TTS-12Hz-0.6B-Base + a running Qwen3-ASR server. Round 4's focus. AI assistance: Claude Code (Opus 4.7, 1M context). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Yuekai Zhang <zhangyuekai@foxmail.com>

@Package

…t, validation logger wired, real eval path, T1 partial Round 4 of the Qwen3-TTS GRPO recipe. Addresses every Codex round-3 blocker and runs T1's static checkpoint verification. Round-3 blocker fixes: verl-project#1 — Real eval path (B1). run_eval.py no longer NotImplementedErrors on the default launcher path. _live_inference_split() spins up AsyncOmni for each checkpoint, drives one Qwen3-TTS rollout per eval row, captures the generated waveform + duration, and feeds it into _score_inference() which transcribes via the remote ASR endpoint and computes per-row CER + duration_ratio. --mock-inference-json retained as an opt-in TEST-ONLY path that bypasses engine + ASR (used by tests/trainer/test_qwen3_tts_run_eval.py). verl-project#2 — Validation logger wired into runtime (M2). AutoRegressiveTTSAgentLoopWorker.__init__ resolves the validation output dir from config.trainer.validation_data_dir (falling back to default_local_dir/validation_audio) and stores log_validation_step + post_run_check_emitted_artifacts. New _emit_validation_artifacts(...) is called from generate_sequences when batch.meta_info["validate"] is True, persisting AC-8 artifacts (>=4 generated wavs + ref + optional target + metrics.json) per validation step. New tests/agent_loop/test_ar_tts_validation_logging.py exercises the worker call path end-to-end (no GPU). verl-project#3 — Canonical Hydra layout restored (M3). Moved trainer yaml back to verl_omni/trainer/config/qwen3_tts/qwen3_tts_trainer.yaml (per AC-9 and Codex round-3 finding). Two fixes needed to keep composition correct in the subdir: - `# @Package _global_` on the trainer yaml itself, so its content goes to the root namespace (not under qwen3_tts.*). - All defaults use absolute paths (`/diffusion/...`, `/data@data`, `/reward@reward`) so groups resolve against the config search root, not against the qwen3_tts/ subdir. - Hardcoded `dp_diffusion_actor` / `dp_diffusion_ref` instead of the `${diffusion/model_engine}` interpolation, which breaks in the subdir (Hydra namespaces the group choices under qwen3_tts.*). main.py adjusted to config_name="qwen3_tts/qwen3_tts_trainer". Hydra composition tests pass against the canonical layout (7/7). verl-project#4 — T1 partial (analyze gate). Round-4 verifies the static contract: `safetensors.safe_open('.hf_cache/Qwen3-TTS-12Hz-0.6B-Base/model.safetensors')` reports 478 total keys, 76 of them under `speaker_encoder.*`. The model supports voice cloning per the plan. The standalone scripts/qwen3_tts_smoke.py runs the full T1 + T2b verification (live engine launch); it is shipped so round 5 (or a human) can execute the live half against a real ref_audio. T1's fail-closed-if-absent contract is preserved by the script. Files changed: 6 modified, 3 new (scripts/qwen3_tts_smoke.py, tests/agent_loop/test_ar_tts_validation_logging.py, restored qwen3_tts/qwen3_tts_trainer.yaml). 113 recipe-scope tests pass (112 R3 + 2 new R4 worker validation logger tests, less 1 because test_run_eval.py is now stricter about the mock path). Still pending: T1 part 2 (live AsyncOmni inference), T2b (stage-0 logprobs in OmniRequestOutput), T3 (remote ASR HTTP smoke), T14 (end-to-end run_smoke.sh). All four require launching the actual servers, which round 5 will attempt. AI assistance: Claude Code (Opus 4.7, 1M context). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Yuekai Zhang <zhangyuekai@foxmail.com>

@Package

…dirs, AC-6 metrics in metrics.json, fail-closed speaker check, post-run AC-8 check Round 5 of the Qwen3-TTS GRPO recipe. Addresses every Codex round-4 action item except T1/T2b/T3/T14 live-engine runs. verl-project#1 — Canonical packaging finished (T11). The qwen3_tts/ subdir now contains the per-role files the plan and AC-9 require: verl_omni/trainer/config/qwen3_tts/ ├── qwen3_tts_trainer.yaml (top-level, # @Package _global_) ├── actor/qwen3_tts_actor.yaml (PPO mini/micro batch overrides) ├── ref/qwen3_tts_ref.yaml (KL disabled in v1) ├── rollout/qwen3_tts_rollout.yaml (vllm_omni_tts + AR agent loop) └── reward/qwen3_tts_reward.yaml (AsrErrorRateRewardManager + remote ASR) The trainer yaml's defaults now load each role twice: the diffusion / shared base first, then the qwen3_tts/<role> file on top. Hydra merges the second over the first at the same target path. Composition tests (7/7) pass against the canonical layout. README and docs paths updated to match. verl-project#2 — AC-6 metrics in validation metrics.json (T16). The worker's _emit_validation_artifacts now writes a stable schema with the full AC-6 scalar set (mean_reward, mean_cer, mean_duration_ratio, policy_loss, kl_loss) — mean_reward / mean_cer / mean_duration_ratio are populated from per-sample rewards / CER / waveform durations; policy_loss / kl_loss are null at validation steps (no parameter update happens during validation). Test: tests/agent_loop/test_ar_tts_validation_logging.py now asserts every AC-6 key is present and policy_loss/kl_loss are explicitly null. verl-project#3 — Post-run AC-8 check wired into runtime (T16). main.py's hydra entry now wraps verl.trainer.main_ppo.run_ppo with a try/finally that resolves the validation artifact dir (config.trainer.validation_data_dir or default_local_dir/validation_audio) and calls post_run_check_emitted_artifacts(...) when the dir exists. A run with any validation_step_* directory containing zero generated audio files raises ArtifactWriteError at the end of the run, regardless of whether training itself succeeded. Tests: tests/trainer/test_qwen3_tts_post_run_check.py covers the resolver, the happy path, and the empty-step fail-closed path. verl-project#4 — Fail-closed _check_speaker_encoder (T1 helper). The Hub-id branch that previously returned True when local weights couldn't be found (round-4's fail-open shortcut) now returns False with a clear stderr message instructing the operator to download the checkpoint first. Tests: tests/trainer/test_qwen3_tts_smoke_helper.py covers four cases: no-local-checkpoint, no-speaker-encoder, speaker-encoder-present, unreadable-file. Manual verification: smoke._check_speaker_encoder('.hf_cache/Qwen3-TTS-12Hz-0.6B-Base') → True (real downloaded checkpoint, 76 speaker_encoder keys) smoke._check_speaker_encoder('Qwen/Qwen3-TTS-12Hz-0.6B-Base') → False (Hub id without local download; fail-closed) Tests: 122 recipe-scope tests pass (118 R0-R4 + 4 R5 helper/post-run/AC-6). Live engines (T1 part 2, T2b, T3, T14) — still pending. The structural plumbing is in place: scripts/qwen3_tts_smoke.py drives the full T1+T2b verification against a real engine; the recipe trainer wraps upstream main_ppo with the AC-8 post-run check; the eval pipeline drives AsyncOmni for both checkpoints. Round 6 (or a human operator) executes these against live processes. Signed-off-by: Yuekai Zhang <zhangyuekai@foxmail.com>

zhtmike added 2 commits April 22, 2026 16:19

refactor the folders strcutures

93cd0c9

revert change

879455a

gemini-code-assist Bot reviewed Apr 22, 2026

View reviewed changes

Comment thread setup.py

Comment thread requirements.txt

Comment thread tests/models/test_diffusers_fsdp_engine.py

zhtmike changed the title ~~refactor the folders strcutures~~ [misc] refactor: refactor the folders structure Apr 22, 2026

zhtmike added 10 commits April 22, 2026 16:45

fix UT

80ed21a

fix CI

10724ea

fix CI

bdd0ede

fix UT

958980d

fix CI

917fa41

fix UT

31a421e

fix CI

9ef6f16

fix UT

2d4237e

fix UT

3bc0caf

fix UT

4b5520c

SamitHuang mentioned this pull request Apr 22, 2026

[doc] chore: supply documentation for flowgrpo training #2

Merged

6 tasks

clean

ae67b0c

SamitHuang reviewed Apr 22, 2026

View reviewed changes

zhtmike added 3 commits April 22, 2026 18:15

add warning message

339d403

add version

89890c9

drop verl requirement for now

2ba95ac

SamitHuang merged commit 411d1d6 into main Apr 22, 2026
4 checks passed

zhtmike deleted the refactor_folder branch April 23, 2026 02:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[misc] refactor: refactor the folders structure#3

[misc] refactor: refactor the folders structure#3
SamitHuang merged 16 commits into
mainfrom
refactor_folder

zhtmike commented Apr 22, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

SamitHuang Apr 22, 2026

Uh oh!

zhtmike Apr 22, 2026

Uh oh!

SamitHuang Apr 22, 2026

Uh oh!

zhtmike Apr 22, 2026

Uh oh!

SamitHuang Apr 22, 2026

Uh oh!

zhtmike Apr 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		parent.diffusers_impl = _omni_impl


		def apply_patches() -> None:

Conversation

zhtmike commented Apr 22, 2026

What does this PR do?

Checklist Before Starting

Test

API and Usage Example

Design & Code Changes

Checklist Before Submitting

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

SamitHuang Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

zhtmike Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

SamitHuang Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

zhtmike Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

SamitHuang Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

zhtmike Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants